Abstract:
Bare-metal instances are crucial for high-value, mission-critical applications on the cloud, giving tenants exclusive use of dedicated hardware resources. Local virtualized disks are essential for bare-metal instances to provide flexible, high-performance storage. Tenants can traditionally choose polling-based software virtualization techniques, but these consume too many valuable host CPU cores and suffer from performance degradation. Cloud vendors can hardly deploy existing hardware-assisted local storage solutions in bare-metal instances because they have no access to the host OS to install customized drivers. Moreover, cloud vendors have difficulty managing and maintaining the local storage devices in bare-metal instances because the hardware resources and host operating systems are completely controlled by tenants, which impacts the availability of the storage devices. This paper presents our design of and experience with BM-Store, a novel high-performance hardware-assisted virtual local storage architecture for bare-metal clouds. BM-Store is transparent to the host, so tenants are unaware of the underlying hardware architecture; it can therefore be deployed at large scale by cloud vendors. BM-Store consists of two components: an FPGA-based BMS-Engine and an ARM-based BMS-Controller. The BMS-Engine accelerates the I/O path to enable high-performance virtual storage, independent of the disk devices, without consuming any CPU resources on the host. The BMS-Controller is responsible for resource management and maintenance to achieve flexible and highly available local storage. Extensive experiments show that BM-Store achieves near-native performance, introducing only about 3 µs of extra latency and an average 4.0% throughput overhead relative to native disks. Compared to SPDK vhost, BM-Store achieves an average bandwidth improvement of 15.7% in microbenchmarks and a maximum throughput improvement of 13.4% in real-world applications.
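The abstract gives no implementation detail beyond the data-plane/control-plane split, and BM-Store itself is an FPGA/ARM hardware design, but that split can be illustrated with a minimal software sketch. All class and method names below are hypothetical, and data migration during a backend swap is omitted for brevity:

```python
# Conceptual sketch only (hypothetical names; the real BM-Store is hardware):
# a data plane serves tenant I/O without host CPU cycles, while a control
# plane can swap backend disks for maintenance with the instance running.
class Backend:
    def __init__(self, name):
        self.name = name
        self.blocks = {}
    def do_io(self, op, lba, data=None):
        if op == "write":
            self.blocks[lba] = data
        return self.blocks.get(lba)

class BMSEngine:
    """Data plane (the FPGA's role): forwards virtual-disk I/O to the
    currently attached backend, independent of the host OS."""
    def __init__(self, backend):
        self.backend = backend
    def submit_io(self, op, lba, data=None):
        return self.backend.do_io(op, lba, data)

class BMSController:
    """Control plane (the ARM's role): resource management and maintenance."""
    def __init__(self, engine):
        self.engine = engine
    def replace_backend(self, new_backend):
        # Isolate an at-risk local disk without tenant cooperation; the
        # tenant-visible virtual disk stays attached throughout.
        # (Data migration to the new backend is omitted here.)
        old = self.engine.backend
        self.engine.backend = new_backend
        return old

engine = BMSEngine(Backend("nvme0"))
ctrl = BMSController(engine)
engine.submit_io("write", 0, b"hello")
ctrl.replace_backend(Backend("nvme1"))  # maintenance swap, no downtime
```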
Abstract:
Many Java applications in data centers suffer from severe processor pipeline frontend bottlenecks, which can be mitigated by profile-guided code layout optimization (PGCLO). To maximize optimization opportunities, state-of-the-art PGCLO solutions adopt continuous optimization to ensure that the code layout consistently matches ever-changing application control flow characteristics. However, existing continuous optimizations inevitably pause the application to switch completely to the new code, which leads to high response latency and significantly degrades the user experience. In this paper, we propose JACO, a novel profile-guided Java code layout optimizer that enables continuous optimization without pausing application services. The key idea of JACO is to allow the old and new code to execute simultaneously rather than switching completely to the new code. JACO is composed of three components: (1) a lightweight profiler that captures the control flow information of the application and generates an optimized function order; (2) a control flow switcher that generates new code based on the optimized function order and switches the application to the new code without pausing its services; and (3) a selective code reclaimer that frees only the memory occupied by inactive old code. We evaluated JACO on both open-source applications and real-world applications from a world-leading company. JACO achieved up to a 16.36% performance improvement on real-world applications. The state-of-the-art approach introduces up to 37.93x latency overhead and interrupts application services, while JACO introduces only a negligible 7% latency overhead.
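The abstract does not include code, but the pause-free switching idea can be sketched in miniature. All names below are hypothetical, and this models dispatch at the function level rather than JACO's actual JIT code layout mechanism: future calls are atomically redirected to the new code version while in-flight calls finish in the old one, and the old version is reclaimed only once inactive.

```python
# Minimal sketch of switch-without-pause plus selective reclamation.
import threading

class ControlFlowSwitcher:
    def __init__(self):
        self.table = {}    # function name -> current code version
        self.active = {}   # code version -> count of in-flight calls
        self.lock = threading.Lock()

    def install(self, name, fn):
        with self.lock:
            self.table[name] = fn
            self.active.setdefault(fn, 0)

    def call(self, name, *args):
        fn = self.table[name]  # atomic read: old and new may coexist
        with self.lock:
            self.active[fn] += 1
        try:
            return fn(*args)
        finally:
            with self.lock:
                self.active[fn] -= 1

    def switch(self, name, new_fn):
        """Redirect future calls to new_fn; never pauses running calls."""
        with self.lock:
            old_fn = self.table[name]
            self.table[name] = new_fn
            self.active.setdefault(new_fn, 0)
        return old_fn

    def reclaim(self, old_fn):
        """Selective reclaim: free old code only when it is inactive."""
        with self.lock:
            return self.active.get(old_fn, 0) == 0

sw = ControlFlowSwitcher()
sw.install("hot_fn", lambda x: x + 1)        # old layout
old = sw.switch("hot_fn", lambda x: x + 1)   # new layout, no pause
if sw.reclaim(old):
    pass  # safe to free the memory occupied by the old code
```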
Abstract:
DRAM failures are one of the major hardware threats to the reliability of large-scale data centers, since uncorrectable errors in DRAM can cause servers to shut down. Existing works address this problem by predicting DRAM failures in advance with machine learning models, in which correctable errors (CEs) are generally deemed the most important feature. A major reason CEs emerge is the accumulated stress caused by intensive workloads; moreover, defective DRAMs do not manifest as system errors until their defective cells are accessed by specific workloads. The workloads running on a server are therefore also important for DRAM failure prediction. In this paper, we focus on the impact of workloads on DRAM failures. We design workload features from both macroscopic and microscopic aspects, i.e., node-level performance metrics and cell-level DRAM access patterns, respectively. Furthermore, we propose the Hierarchical DRAM Error Code (HiDEC) to represent the DRAM access pattern. We apply several decision-tree-based models to DRAM failure prediction to highlight the generality of the designed features. Experiments are carried out on a dataset collected from a real-world commercial data center. The results show that both the macroscopic and microscopic features bring significant improvements to prediction performance.
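The abstract does not define HiDEC's exact encoding. As a rough sketch under assumed bit widths (the packing scheme and all names below are illustrative, not the paper's), error positions can be encoded hierarchically and combined with node-level metrics to feed a decision-tree model:

```python
# Illustrative only: hierarchically pack (bank, row, column) of each CE so
# errors sharing a bank/row prefix get nearby codes, then combine the
# resulting microscopic features with macroscopic node-level metrics.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def hierarchical_error_code(errors, row_bits=16, col_bits=10):
    """errors: list of (bank, row, col) CE positions on a DIMM."""
    return [(bank << (row_bits + col_bits)) | (row << col_bits) | col
            for bank, row, col in errors]

def build_features(node_metrics, errors):
    """node_metrics: macroscopic features, e.g. [cpu_util, mem_bw, ...]."""
    codes = hierarchical_error_code(errors)
    micro = [len(codes), len(set(codes)),
             float(np.std(codes)) if codes else 0.0]
    return list(node_metrics) + micro

# Hypothetical training data: one row per DIMM observation window.
X = [build_features([0.7, 120.0], [(1, 40, 3), (1, 40, 7)]),
     build_features([0.2, 30.0], [])]
y = [1, 0]  # 1 = DIMM failed within the prediction horizon
clf = DecisionTreeClassifier(max_depth=4).fit(X, y)
```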
Abstract:
Disk and memory faults are the leading causes of server breakdown. A proactive solution is to predict such hardware failures at runtime, then isolate the at-risk hardware and back up its data. However, current model-based predictors cannot exploit discrete time-series data, such as the values of device attributes, which conveys high-level information about device behavior. In this paper, we propose a novel deep-learning-based scheme for system-level hardware failure prediction. We normalize the distribution of sample attributes from different vendors to make use of diverse training sets. We propose a temporal convolutional neural network (CNN) based model that is insensitive to noise in the time dimension. Finally, we design a loss function that trains the model effectively on extremely imbalanced samples. Experimental results on an open S.M.A.R.T. data set and an industrial data set show the effectiveness of the proposed scheme.
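The paper's exact architecture and loss are not given in the abstract. The sketch below, with hypothetical layer sizes and a focal-style loss as one common choice for extreme class imbalance, illustrates a temporal 1D CNN over per-device attribute time series:

```python
# Sketch under assumptions (PyTorch): pooling gives some tolerance to
# noise in the time dimension; the loss down-weights easy negatives so
# the rare failure samples dominate training.
import torch
import torch.nn as nn

class TemporalCNN(nn.Module):
    def __init__(self, n_attrs):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_attrs, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),  # temporal pooling smooths over noisy steps
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 1),
        )

    def forward(self, x):               # x: (batch, n_attrs, window)
        return self.net(x).squeeze(-1)  # raw logits, one per device

def focal_loss(logits, targets, alpha=0.9, gamma=2.0):
    """Focal-style loss for extremely imbalanced failure labels."""
    bce = nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    p_t = torch.exp(-bce)               # probability of the true class
    weight = alpha * targets + (1 - alpha) * (1 - targets)
    return (weight * (1 - p_t) ** gamma * bce).mean()

model = TemporalCNN(n_attrs=12)
x = torch.randn(8, 12, 30)              # 8 devices, 12 attrs, 30 days
y = torch.tensor([1., 0., 0., 0., 0., 0., 0., 0.])
loss = focal_loss(model(x), y)
loss.backward()
```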
Abstract:
Eight prestressed steel reinforced concrete beams (PSRCB) and six steel reinforced concrete beams (SRCB) were tested with varying concrete compressive strength, rebar ratio, cover thickness, steel cover thickness, and degree of prestressing. A comparative study of the mechanical properties of PSRCB and SRCB was carried out, covering flexural capacity, deflection, and maximum crack width. The experiments indicate that PSRCB has better mechanical performance: higher flexural capacity, greater stiffness, and better crack resistance.